Rationale and Research Questions

The transition to renewable energy is a critical issue for mitigating climate change and promoting sustainable development. As the world’s second-largest carbon emitter (https://www.epa.gov/ghgemissions/global-greenhouse-gas-emissions-data), the United States has set ambitious targets for reducing greenhouse gas emissions and increasing renewable energy generation. However, the factors that drive renewable energy adoption in the US are not fully understood, and there is a need for more research to identify the most effective policies and incentives for promoting clean energy.

In this project, we aim to investigate the key drivers of renewable energy adoption in the US and their relative importance. We will use a comprehensive dataset derived from public-facing websites which includes information on state-level renewable energy capacity, energy policies, economic indicators, and demographic variables.

Our research questions include: 1. How do economic indicators, such as energy prices, GDP, and oil production, influence renewable energy adoption in different regions of the US? 2. What role do demographic variables, such as IQ and political affiliation, play in shaping attitudes towards renewable energy and driving adoption?

Project Setup

knitr::opts_chunk$set(echo = TRUE)
#1
#Install familiar packages
library(tidyverse);library(lubridate);library(viridis);library(here)

#install.packages("rvest")
library(rvest)

#install.packages("dataRetrieval")
library(dataRetrieval)

#install.packages("tidycensus")
library(tidycensus)
library(dplyr)
library(rvest)
library(ggplot2)

#install.packages("corrplot")
library(corrplot)
library(stringr)

library(sf)
library(leaflet)
library(mapview); mapviewOptions(fgb = FALSE)
sf::sf_use_s2(FALSE)

Dataset information

The dataset for this analysis was collected from various sources including the National Renewable Energy Laboratory (NREL), the United States Energy Information Administration (EIA), and wisevoter.com. The dataset includes information on various factors that may drive renewable energy adoption in the US, such as state-level policies, economic indicators, demographic characteristics, and natural resources.

The dataset was first imported into R as separate data frames and then merged into a single data frame using the merge() function. Before merging, data cleaning and manipulation were performed to ensure consistency across variables and eliminate missing values. For instance, some variables were renamed, and missing values were replaced with appropriate values or removed entirely.

The final dataset contains 50 observations (one for each state) and 7 variables. The variables in the dataset are summarized in the table below:

Variable Unit Range / Central Tendency Data Source
Good_Air_Quality_Days_Percentage Percentage 35.8-89.4 wisevoter.com
Oil_Production Thousand barrels per day 0-4700 wisevoter.com
GDP Billion dollars 54.7-2,784.1 wisevoter.com
Sunshine Hours per day 3.7-10.3 wisevoter.com
Electricity_Cost Cents per kilowatt hour 5.6-20.5 wisevoter.com
Average_IQ IQ points 94.8-103.9 wisevoter.com
Republican_vote Percentage 23.5-70.4 wisevoter.com

Data Scraping

url_aq <- "https://wisevoter.com/state-rankings/air-quality-by-state/"
page1 <- read_html(url_aq)

states <- page1 %>% 
  html_nodes("td:nth-child(2)") %>% 
  html_text()

good.air.days <- page1 %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_aq <- data.frame(
  States = states,
  Good_Air_Quality_Days_Percentage = good.air.days
)
url_oil <- "https://wisevoter.com/state-rankings/oil-production-by-state/"
page2 <- read_html(url_oil)

states2 <- page2 %>% 
  html_nodes("td:nth-child(2)") %>% 
  html_text()

oil.production <- page2 %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_oil <- data.frame(
  States = states2,
  Oil_Production = oil.production
)
url_gdp <- "https://wisevoter.com/state-rankings/gdp-by-state/"
page <- read_html(url_gdp)

states3 <- page %>% 
  html_nodes("td:nth-child(2)") %>% 
  html_text()

gdp <- page %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_gdp <- data.frame(
  States = states3,
  GDP = gdp
)
url_sunny <- "https://wisevoter.com/state-rankings/sunniest-states/"
page <- read_html(url_sunny)

states4 <- page %>% 
  html_nodes("td:nth-child(2)") %>% 
  html_text()

sunshine <- page %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_sunny <- data.frame(
  States = states4,
  Sunshine = sunshine
)
url_elec <- "https://wisevoter.com/state-rankings/electricity-cost-by-state/"
page <- read_html(url_elec)

states5 <- page %>% 
  html_nodes("td:nth-child(2)") %>% 
  html_text()

cost <- page %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_ecost <- data.frame(
  States = states5,
  Electricity_Cost = cost
)
url_iq <- "https://wisevoter.com/state-rankings/average-iq-by-state/"
page <- read_html(url_iq)

states6 <- page %>% 
  html_nodes("td:nth-child(2)") %>% 
  html_text()

iq <- page %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_iq <- data.frame(
  States = states6,
  Average_IQ = iq
)
url_party <- "https://wisevoter.com/state-rankings/red-and-blue-states"
page <- read_html(url_party)

states7 <- page %>% 
  html_nodes("td:nth-child(1)") %>% 
  html_text()

rep_vote <- page %>% 
  html_nodes("td:nth-child(3)") %>% 
  html_text()

df_party <- data.frame(
  States = states7,
  Republican_vote = rep_vote
)
# Merge data frames
state_rankings_df_merge <- Reduce(function(x, y) merge(x, y, by = "States", all = TRUE), 
                   list(df_aq, df_oil, df_gdp, df_sunny, df_ecost, df_iq, df_party))

# View merged data frame
head(state_rankings_df_merge)
##       States Good_Air_Quality_Days_Percentage Oil_Production                GDP
## 1    Alabama                            85.8%       425 Mbbl   $257,465,000,000
## 2     Alaska                            92.3%   159,623 Mbbl    $57,983,000,000
## 3    Arizona                            64.6%         6 Mbbl   $429,819,000,000
## 4   Arkansas                              83%     4,179 Mbbl   $150,483,000,000
## 5 California                            60.6%   130,593 Mbbl $3,513,348,000,000
## 6   Colorado                            75.4%   144,621 Mbbl   $440,903,000,000
##      Sunshine Electricity_Cost Average_IQ Republican_vote
## 1 4,660 kJ/m²    $0.1  per kWh       95.7             62%
## 2        <NA>    $0.2  per kWh         99           52.8%
## 3 5,755 kJ/m²   $0.11  per kWh       97.4           49.1%
## 4 4,724 kJ/m²   $0.09  per kWh       97.5           62.4%
## 5 5,050 kJ/m²    $0.2  per kWh       95.5           34.3%
## 6 4,960 kJ/m²   $0.11  per kWh      101.6           41.9%
write.csv(state_rankings_df_merge, file = here("data_processed/state rankings_raw.csv"),
          row.names = FALSE)

Data Cleaning

Next, we clean the data scraped from various websites before merging it with the power plant data. The dataframe we have right now consisted of columns made of chars. After removing the units (such as per kWh) and characters (such as comma and $), we converted the chars to numeric values and created additional columns as needed based on existing columns.

# read in raw data
state_ranking <- read.csv(here("data_processed/state rankings_raw.csv"), header = TRUE)
# save a copy of raw data before making changes
state_ranking_raw <- state_ranking

# remove % in good_air_q col
state_ranking$Good_Air_Quality_Days_Percentage <- str_sub(state_ranking$Good_Air_Quality_Days_Percentage, end = -2)
# change chr to numeric
state_ranking$Good_Air_Quality_Days_Percentage <- as.numeric(state_ranking$Good_Air_Quality_Days_Percentage)

# remove unit in oil_production
state_ranking$Oil_Production <- str_sub(state_ranking$Oil_Production, end = -5)
# remove comma 
state_ranking$Oil_Production <- gsub(',','', state_ranking$Oil_Production)
# change chr to numeric
state_ranking$Oil_Production <- as.numeric(state_ranking$Oil_Production)

# remove $ in GDP
state_ranking$GDP <- str_sub(state_ranking$GDP, 2)
# remove comma
state_ranking$GDP <- gsub(',','', state_ranking$GDP)
# change to numeric
state_ranking$GDP <- as.numeric(state_ranking$GDP)

# remove unit in sunshine
state_ranking$Sunshine <- str_sub(state_ranking$Sunshine, end=-6)
# remove comma
state_ranking$Sunshine <- gsub(',','', state_ranking$Sunshine)
# change to numeric
state_ranking$Sunshine <- as.numeric(state_ranking$Sunshine)

# clean electricity_cost
state_ranking$Electricity_Cost <- str_sub(state_ranking$Electricity_Cost, end=-8)
state_ranking$Electricity_Cost <- str_sub(state_ranking$Electricity_Cost, 2)
state_ranking$Electricity_Cost <- as.numeric(state_ranking$Electricity_Cost)

# clean average
state_ranking$Average_IQ <- as.numeric(state_ranking$Average_IQ)

# clean republican_vote
state_ranking$Republican_vote <- str_sub(state_ranking$Republican_vote, end=-2)
state_ranking$Republican_vote <- as.numeric(state_ranking$Republican_vote)

# create new column: party affiliation
state_ranking$party <- with(state_ranking, ifelse(Republican_vote > 50, "Red", "Blue"))

For the response variable, clean energy adoption rate, data wrangling and calculation are needed. We first transform all NAs to 0 so that we can later aggregate the capacity for each energy source. The next step is to calculate the total clean energy capacity and total energy generation capacity for each power plant. After that, we aggregate all the power plants by states and calculate the clean energy percentage for each state.

Power_plant <- read.csv(here("data_raw/Power_Plants.csv"), stringsAsFactors = TRUE)

Power_plant[is.na(Power_plant)] <- 0

Power_plant <- Power_plant %>%
  mutate(Total_clean = Bio_MW + Geo_MW + Hydro_MW + HydroPS_MW + Nuclear_MW + Solar_MW + Wind_MW) %>%
  mutate(Total = Total_clean + Coal_MW + NG_MW + Crude_MW)

Clean_energy <- aggregate(cbind(Power_plant$Total_clean, Power_plant$Total), list(Power_plant$State), FUN = sum)
colnames(Clean_energy) <- c("States", "Clean_energy_MW", "Total_MW")
Clean_energy$Clean_percent <- Clean_energy$Clean_energy_MW/Clean_energy$Total_MW

We merge the two clean datasets as one final dataset that is ready for analysis.

final_df <- merge(Clean_energy, state_ranking)

Exploratory Analysis

Correlation plots

We plot correlation plots to explore the covariance among response and explanatory variables.

final_naomit <- final_df %>%
  select(Clean_percent:Republican_vote) %>%
  na.omit()
final_corr <- cor(final_naomit)
corrplot.mixed(final_corr, upper = "ellipse", tl.cex = 0.6)

Histogram

We want to take a deeper dive into the distribution of each variables by histograms. Based on the histograms, oil production, GDP and electricity cost are skewed, which indicates that transformation is needed for those variables.

par(mfrow=c(2,4))
hist(final_naomit$Clean_percent, xlab = "Clean Energy Adoption %", main = NULL)
hist(final_naomit$Good_Air_Quality_Days_Percentage, xlab = "Good Air Quality Days %", main = NULL)
hist(final_naomit$Oil_Production, xlab = "Good Air Quality Days %", main = NULL)
hist(final_naomit$GDP, xlab = "GDP", main = NULL)
hist(final_naomit$Sunshine, xlab = "Sunshine", main = NULL)
hist(final_naomit$Electricity_Cost, xlab = "Electricity Cost", main = NULL)
hist(final_naomit$Average_IQ, xlab = "Average IQ", main = NULL)
hist(final_naomit$Republican_vote, xlab = "Republican Vote", main = NULL)

Map

Next, we visualize our power plants data by making an interactive map widget with the leaflet package. We first categorized the power plant as “clean energy” or “other” based on the “primary source” attribute - we considered biomass, geothermal, hydroelectric, pumped storage, nuclear, solar, and wind to be clean. For the map, we plotted clean energy power plants as blue and other as red; the size of the markers was scaled to indicate the maximum output and was expressed in megawatts in original data. You can pan around and zoom in and out to explore the dataset.

# read in the shapefile
powerp <- st_read('../LiuSmootZhang_ENV872_EDA_FinalProject/data_raw/Power_Plants.shp')
## Reading layer `Power_Plants' from data source 
##   `/home/guest/LiuSmootZhang_ENV872_EDA_FinalProject/data_raw/Power_Plants.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 11569 features and 32 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -171.7124 ymin: 17.94712 xmax: -65.27956 ymax: 71.292
## Geodetic CRS:  WGS 84
powerp$PrimSource <- as.factor(powerp$PrimSource)

unique(powerp$PrimSource)
##  [1] petroleum      hydroelectric  natural gas    nuclear        coal          
##  [6] pumped storage geothermal     wind           biomass        batteries     
## [11] other          solar         
## 12 Levels: batteries biomass coal geothermal hydroelectric ... wind
powerp$PrimSource <- str_trim(powerp$PrimSource, 'both')
# clean energy: biomass, geothermal, hydroelectric, pumped, nuclear, solar, wind
# create new column: clean energy
powerp$clean <- with(powerp,
                     ifelse(PrimSource=='biomass'|PrimSource=='geothermal'|PrimSource=='hydroelectric'|PrimSource=='pumpedstorage'|PrimSource=='nuclear'|PrimSource=='solar'|PrimSource=='wind', 'clean energy', 'others'))

powerp$clean <- as.factor(powerp$clean)

# colors for the map widget
pal <- colorFactor(c('skyblue','tomato'), powerp$clean)
# make the map widget
leaflet(powerp) %>% 
  addProviderTiles(providers$CartoDB.Positron)%>%  
  addCircleMarkers(
    radius = ~sqrt(Total_MW)/5,
    color = ~pal(clean),
    stroke = FALSE, fillOpacity = 0.5
  ) %>% 
  addLegend("bottomright", pal = pal, values = ~clean,
    title = "Power plants",
    opacity = 1
  )

We then mapped the clean energy percentage in each state. Thanks to the help from https://rstudio.github.io/leaflet/choropleths.html, we were able to make this interactive map widge. You can pan around, zoom, and hover your mouse over each state to see its clean enegery percentage and state name in a popup window. The state boundaries data came from https://www.sciencebase.gov/catalog/item/52c78623e4b060b9ebca5be5.

# read in the shapefile
states_shp <- st_read('../LiuSmootZhang_ENV872_EDA_FinalProject/data_raw/tl_2012_us_state.shp')
## Reading layer `tl_2012_us_state' from data source 
##   `/home/guest/LiuSmootZhang_ENV872_EDA_FinalProject/data_raw/tl_2012_us_state.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 56 features and 17 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -19951910 ymin: -1643352 xmax: 20021890 ymax: 11554790
## Projected CRS: Popular Visualisation CRS / Mercator
states_shp <- st_transform(states_shp, c=4326)

# join the dataframe with clean energy percentage by state to the state shapefile
join_states <- full_join(states_shp, final_df, by = c('NAME' = 'States'))

# visualize on map
bins <- c(NA, 0, 0.20, 0.40, 0.60, 0.80, 1.00)
pal2 <- colorBin("Greens", domain = join_states$Clean_percent, bins = bins)

labels <- sprintf(
  "<strong>%s</strong><br/>%g ",
  join_states$NAME, join_states$Clean_percent) %>% lapply(htmltools::HTML)

leaflet(join_states) %>%
  setView(-96, 37.8, 3) %>%
  addPolygons(
    fillColor = ~pal2(Clean_percent),
    weight = 2,
    opacity = 1,
    color = "white",
    dashArray = "3",
    fillOpacity = 0.7,
    highlightOptions = highlightOptions(
      weight = 5,
      color = "#666",
      dashArray = "",
      fillOpacity = 0.7,
      bringToFront = TRUE),
    label = labels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "15px",
      direction = "auto")) %>%
  addLegend(pal = pal2, values = ~Clean_percent, opacity = 0.7, title = NULL,
    position = "bottomright")

Analysis

Regression Analysis

Given the skewed distribution of certain variables, we use the log-transformed versions in the Generalized Linear Model (GLM). Out of the concern of over-parameterize a linear model, we use the Akaike’s Information Criterion (AIC) to compute a stepwise regression that removes explanatory variables from a full set of suggested options.

We feed all potential explanatory variables into the full model and use step() function to decide the optimal model. Based on the stepwise regression, we narrow down the explanatory variables to air quality, logged GDP, average IQ and Republican vote.

TPAIC <- lm(data = final_naomit, Clean_percent ~ Good_Air_Quality_Days_Percentage + log(Oil_Production) + log(GDP) + Sunshine + log(Electricity_Cost) + Average_IQ + Republican_vote)

#Choose a model by AIC in a Stepwise Algorithm
step(TPAIC)
## Start:  AIC=-108.61
## Clean_percent ~ Good_Air_Quality_Days_Percentage + log(Oil_Production) + 
##     log(GDP) + Sunshine + log(Electricity_Cost) + Average_IQ + 
##     Republican_vote
## 
##                                    Df Sum of Sq     RSS     AIC
## - Sunshine                          1  0.007797 0.56449 -110.18
## - log(Oil_Production)               1  0.009092 0.56578 -110.11
## - log(Electricity_Cost)             1  0.025375 0.58206 -109.23
## - Republican_vote                   1  0.035648 0.59234 -108.69
## <none>                                          0.55669 -108.61
## - Good_Air_Quality_Days_Percentage  1  0.039261 0.59595 -108.50
## - Average_IQ                        1  0.093106 0.64979 -105.82
## - log(GDP)                          1  0.157498 0.71419 -102.89
## 
## Step:  AIC=-110.18
## Clean_percent ~ Good_Air_Quality_Days_Percentage + log(Oil_Production) + 
##     log(GDP) + log(Electricity_Cost) + Average_IQ + Republican_vote
## 
##                                    Df Sum of Sq     RSS     AIC
## - log(Oil_Production)               1  0.008197 0.57268 -111.73
## - log(Electricity_Cost)             1  0.018398 0.58288 -111.19
## <none>                                          0.56449 -110.18
## - Republican_vote                   1  0.048844 0.61333 -109.61
## - Good_Air_Quality_Days_Percentage  1  0.074433 0.63892 -108.34
## - Average_IQ                        1  0.101302 0.66579 -107.06
## - log(GDP)                          1  0.159932 0.72442 -104.45
## 
## Step:  AIC=-111.73
## Clean_percent ~ Good_Air_Quality_Days_Percentage + log(GDP) + 
##     log(Electricity_Cost) + Average_IQ + Republican_vote
## 
##                                    Df Sum of Sq     RSS     AIC
## - log(Electricity_Cost)             1  0.017277 0.58996 -112.81
## <none>                                          0.57268 -111.73
## - Republican_vote                   1  0.054567 0.62725 -110.91
## - Good_Air_Quality_Days_Percentage  1  0.066666 0.63935 -110.32
## - Average_IQ                        1  0.097791 0.67047 -108.85
## - log(GDP)                          1  0.160578 0.73326 -106.07
## 
## Step:  AIC=-112.81
## Clean_percent ~ Good_Air_Quality_Days_Percentage + log(GDP) + 
##     Average_IQ + Republican_vote
## 
##                                    Df Sum of Sq     RSS     AIC
## <none>                                          0.58996 -112.81
## - Good_Air_Quality_Days_Percentage  1  0.075281 0.66524 -111.09
## - Average_IQ                        1  0.090956 0.68092 -110.37
## - Republican_vote                   1  0.097617 0.68758 -110.07
## - log(GDP)                          1  0.144389 0.73435 -108.03
## 
## Call:
## lm(formula = Clean_percent ~ Good_Air_Quality_Days_Percentage + 
##     log(GDP) + Average_IQ + Republican_vote, data = final_naomit)
## 
## Coefficients:
##                      (Intercept)  Good_Air_Quality_Days_Percentage  
##                         1.844051                         -0.007738  
##                         log(GDP)                        Average_IQ  
##                        -0.091895                          0.020877  
##                  Republican_vote  
##                        -0.009484
TPmodel <- lm(data = final_naomit, Clean_percent ~ Good_Air_Quality_Days_Percentage + log(GDP) + Average_IQ + Republican_vote)
summary(TPmodel)
## 
## Call:
## lm(formula = Clean_percent ~ Good_Air_Quality_Days_Percentage + 
##     log(GDP) + Average_IQ + Republican_vote, data = final_naomit)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.21917 -0.09577  0.00624  0.08894  0.35866 
## 
## Coefficients:
##                                   Estimate Std. Error t value Pr(>|t|)  
## (Intercept)                       1.844051   1.591740   1.159   0.2572  
## Good_Air_Quality_Days_Percentage -0.007738   0.004248  -1.821   0.0801 .
## log(GDP)                         -0.091895   0.036429  -2.523   0.0181 *
## Average_IQ                        0.020877   0.010428   2.002   0.0558 .
## Republican_vote                  -0.009484   0.004572  -2.074   0.0481 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1506 on 26 degrees of freedom
## Multiple R-squared:  0.3834, Adjusted R-squared:  0.2885 
## F-statistic: 4.041 on 4 and 26 DF,  p-value: 0.01115

The coefficients table provides the estimated coefficients for each explanatory variable, which indicate the expected change in the response variable for a one-unit increase in the explanatory variable while holding other variables constant. For example, a one-unit increase in the log(GDP) variable is associated with a 0.0919 decrease in the Clean_percent variable while holding the other variables constant.

The p-values in the table indicate the level of statistical significance for each variable. A p-value less than the significance level of 0.05 suggests that the variable is statistically significant in explaining the variation in the response variable. In this model, log(GDP) and Republican_vote are statistically significant, while Average_IQ is marginally significant at the 0.05 level.

The Adjusted R-squared value of 0.2885 indicates that the model explains approximately 28.85% of the variation in the response variable after adjusting for the number of explanatory variables. The F-statistic of 4.041 with a p-value of 0.01115 suggests that at least one of the explanatory variables is significant in explaining the response variable. The Residual standard error of 0.1506 indicates the variation in the response variable that cannot be explained by the model.]

par(mfrow = c(2,2), mar=c(4,4,4,4))
plot(TPmodel)

par(mfrow = c(1,1))

Based on the above plots, there are no outliners that significantly impact the regression result.

T Test

party <- t.test(final_df$Clean_percent ~ final_df$party)
party
## 
##  Welch Two Sample t-test
## 
## data:  final_df$Clean_percent by final_df$party
## t = 1.0798, df = 47.659, p-value = 0.2857
## alternative hypothesis: true difference in means between group Blue and group Red is not equal to 0
## 95 percent confidence interval:
##  -0.05246162  0.17412949
## sample estimates:
## mean in group Blue  mean in group Red 
##          0.4071395          0.3463056

Based on the T-test result, the clean energy adoption rate for Red and Blue states are not statistically different. The test statistic t is 1.0798 with degrees of freedom (df) of 47.659 and a p-value of 0.2857. The p-value is greater than the significance level of 0.05, so we fail to reject the null hypothesis of equality. This means that we do not have enough evidence to conclude that there is a significant difference in the mean percentage of clean energy adoption between Blue and Red states.

ggplot(final_df, aes(y=Clean_percent, x=party, fill = party)) +
  geom_boxplot()+
  labs(title = "Clean Energy Adoption Rate by Red or Blue State", y="Clean Energy Percentage (%)", x="Party", fill ="Party")+
  scale_fill_manual(values=c("skyblue", "tomato"))

Conclusion

In this study, we sought to investigate the relationship between various demographic, economic, and political factors with the adoption of clean energy in the US states. Our analysis of the data revealed some interesting findings that can help us better understand this relationship.

Our regression analysis showed that state GDP and Republican vote are significant predictors of the percentage of clean energy adoption in the state. Average IQ is a marginally significant predictor of the percentage of clean energy adoption in the state. Of these, GDP and Republican vote had a negative relationship with clean energy adoption, meaning that as these variables increased, the percentage of clean energy adoption decreased. In contrast, average IQ had a positive relationship with clean energy adoption, suggesting that states with higher average IQ tend to have a higher percentage of clean energy adoption.

While the negative relationship between GDP and clean energy adoption may seem counterintuitive, it may reflect the fact that high GDP states have a stronger presence of energy-intensive industries, which may be resistant to clean energy adoption. Additionally, the negative relationship with Republican vote suggests that political ideology may also play a role in clean energy adoption, with Republican-leaning states being less likely to adopt clean energy.

Our t-test revealed that there is no statistically significant difference between the mean percentage of clean energy adoption in Blue and Red states. While the mean percentage of clean energy adoption was higher in Blue states, this difference was not statistically significant. However, this finding should be interpreted with caution, as our analysis did not control for other factors that may influence clean energy adoption.

Overall, our findings suggest that demographic, economic, and political factors are all important predictors of clean energy adoption in the US states. States with better air quality and higher average IQ tend to have a higher percentage of clean energy adoption, while states with higher GDP and more Republican-leaning voters tend to have a lower percentage of clean energy adoption. These findings may have important implications for policymakers seeking to promote clean energy adoption in the US and can serve as a starting point for further research in this area.